 failure probability


Proposal-Guided Greedy Surrogate Refinement for PDE-Driven High-Dimensional Rare-Event Estimation

arXiv.org Machine Learning

Accurate surrogate construction for PDE-driven high-dimensional rare-event simulation is challenging when performance evaluations are expensive. Since a globally accurate surrogate may require many high-fidelity evaluations, adaptive importance sampling provides a natural localization tool: its evolving proposal distribution progressively identifies the failure-relevant region. Motivated by this observation, we propose a surrogate-assisted adaptive importance sampling framework that refines the surrogate locally along the evolving proposal, rather than over the entire input space. The surrogate combines an encoder with a neural network, providing a low-dimensional latent representation for both prediction and sample selection. At each adaptive iteration, candidates drawn from the current proposal are selected by a greedy latent-space rule balancing proximity to the estimated failure boundary and sample diversity. The selected samples are evaluated by the high-fidelity model and used to refine the surrogate, which then guides the subsequent cross-entropy-type adaptive proposal update. We establish one-step proposal stability bounds under local surrogate errors, together with surrogate-induced misclassification and finite-sample estimation error bounds. Numerical experiments on multimodal benchmarks and PDE-driven rare-event problems up to 100 dimensions show that the proposed method achieves accuracy comparable to true-model adaptive importance sampling while requiring substantially fewer high-fidelity evaluations.
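The cross-entropy-type proposal update mentioned in this abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' method: it uses the true model throughout and omits the encoder, the neural surrogate, and the greedy latent-space selection. The limit-state function `limit_state`, the unit-covariance Gaussian proposal, and all parameter values (`n`, `rho`, `b`) are assumptions chosen for illustration.

```python
import math
import numpy as np

def limit_state(x, b=3.5):
    """Toy performance function (assumed): failure occurs when g(x) <= 0."""
    return b - x.sum(axis=1) / math.sqrt(x.shape[1])

def ce_importance_sampling(g, dim, n=2000, rho=0.1, max_iter=20, seed=0):
    """Cross-entropy adaptive importance sampling with a mean-shifted
    Gaussian proposal N(mean, I); the nominal input law is N(0, I)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    for _ in range(max_iter):
        x = rng.standard_normal((n, dim)) + mean
        gx = g(x)
        # Intermediate failure level: rho-quantile of g, never below 0.
        level = max(float(np.quantile(gx, rho)), 0.0)
        # Likelihood ratio of nominal N(0, I) over proposal N(mean, I).
        w = np.exp(-x @ mean + 0.5 * mean @ mean)
        if level <= 0.0:
            # The proposal has reached the failure region: final estimate.
            return float(np.mean(w * (gx <= 0.0))), mean
        # Cross-entropy update: weighted mean of the elite samples.
        elite = gx <= level
        mean = (w[elite, None] * x[elite]).sum(axis=0) / w[elite].sum()
    raise RuntimeError("proposal did not reach the failure region")

if __name__ == "__main__":
    p_hat, final_mean = ce_importance_sampling(limit_state, dim=10)
    exact = 0.5 * math.erfc(3.5 / math.sqrt(2.0))  # closed form for this toy g
    print(p_hat, exact)
```

In a surrogate-assisted variant, the calls to `g` inside the loop would be replaced by a cheap approximation that is refined only near the current proposal, which is where the high-fidelity savings described in the abstract would come from.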


Verification Based Solution for Structured MAB Problems

Neural Information Processing Systems

We consider the problem of finding the best arm in a stochastic Multi-armed Bandit (MAB) game and propose a general framework based on verification that applies to multiple well-motivated generalizations of the classic MAB problem. In these generalizations, additional structure is known in advance, making the task of verifying the optimality of a candidate easier than discovering the best arm. Our results focus on the scenario where the failure probability must be very low; we essentially show that in this high-confidence regime, identifying the best arm is as easy as the task of verification. We demonstrate the effectiveness of our framework by applying it to, and matching or improving the state-of-the-art results for, the following problems: linear bandits, dueling bandits with the Condorcet assumption, Copeland dueling bandits, unimodal bandits, and graphical bandits.




Supplementary Material

Neural Information Processing Systems

For a vector $v \in \mathbb{R}^d$ and $H \subseteq [d]$, we use $v_H$ to denote the vector that is equal to $v$ on $i \in H$, and zero otherwise. For a real-valued random variable $X$ and $m \in \mathbb{N}$, we use $\|X\|_{L_m}$ to denote $(\mathbb{E}|X|^m)^{1/m}$. For a set $S \subseteq \mathbb{R}^d$ and a function $f$, we also define the set function notation $f(S)$ as $\{f(x) \mid x \in S\}$.

A.1 Finding a stable subset from a stable weighted subset

For a set $S$ of $n$ points, we define $\Delta_{n,\epsilon}$ as the set of weights $w \in \mathbb{R}^n$ such that $w_i \in [0, 1/((1-\epsilon)n)]$ for all $i \in [n]$ and $\sum_i w_i = 1$. For a fixed vector $\mu \in \mathbb{R}^d$ that will be clear from context, a set of $n$ points $S = \{x_1, \dots, x_n\}$, and weights $w \in \Delta_{n,\epsilon}$ over $S$, we use $\Sigma_w$ to denote $\sum_i w_i (x_i - \mu)(x_i - \mu)^\top$. The goal of this section is to show Proposition A.1, which states that if we have a weight $w$ over $S$ such that $\Sigma_w$ (with respect to some vector $\mu$) has bounded $\mathcal{X}_k$ norm proportional to $\sigma^2$ for some $\sigma > 0$, then there must exist some large subset $S' \subseteq S$ that is stable with respect to $\mu$ and $\sigma$.

Let $S$ be a set of $n$ points in $\mathbb{R}^d$. Suppose that there exists a $w \in \Delta_{n,\epsilon}$ such that $\|\Sigma_w\|_{\mathcal{X}_k} \le B\sigma^2$ for some vector $\mu$. Then there exists a subset $S' \subseteq S$ such that (i) $|S'| \ge (1-2\epsilon)n$ and (ii) $S'$ is $(\epsilon, \delta, k)$-stable with respect to $\mu$ and $\sigma$, where $\delta = O(\sqrt{B} + 1)$.

Observe that $\|\Sigma_w\|_{\mathcal{X}_k} \le B\sigma^2$ implies $\|\Sigma_w - \sigma^2 I\|_{\mathcal{X}_k} \le (B+1)\sigma^2$ by the triangle inequality. In order to show Proposition A.1, we show Lemma A.2, which is a weakening of Proposition A.1 where we additionally assume that $\mu_w = \sum_i w_i x_i$ is close to $\mu$, where $\mu$ is the vector we use to define $\Sigma_w$ as well as the vector with respect to which we want a large sample subset $S'$ to be stable. To use Lemma A.2, we additionally show Proposition A.4, which states that $\|\Sigma_w\|_{\mathcal{X}_k} \le B\sigma^2$ is enough to imply that $\mu_w$ is close to $\mu$.
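The notation in this excerpt can be made concrete with a small numerical sketch. It assumes the weight set caps each weight at 1/((1-eps)n) with the weights summing to one, and that the weighted matrix is the second-moment matrix of the points centered at mu. The function names `restrict`, `in_weight_set`, and `weighted_second_moment` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def restrict(v, H):
    """Return the vector equal to v on the coordinates in H and zero elsewhere."""
    out = np.zeros_like(v)
    idx = list(H)
    out[idx] = v[idx]
    return out

def in_weight_set(w, eps, tol=1e-9):
    """Check that w has entries in [0, 1/((1-eps)n)] and sums to 1."""
    n = len(w)
    cap = 1.0 / ((1.0 - eps) * n)
    in_box = np.all(w >= -tol) and np.all(w <= cap + tol)
    return bool(in_box and abs(w.sum() - 1.0) <= tol)

def weighted_second_moment(X, w, mu):
    """Compute sum_i w_i (x_i - mu)(x_i - mu)^T for the rows x_i of X."""
    D = X - mu
    return (w[:, None] * D).T @ D

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((10, 3))
    # Uniform weights on 8 of 10 points sit exactly at the cap for eps = 0.2.
    w = np.zeros(10)
    w[:8] = 1.0 / 8.0
    print(in_weight_set(w, eps=0.2))
    print(weighted_second_moment(X, w, mu=np.zeros(3)))
```

Uniform weights over any subset of size (1-eps)n always satisfy the cap, which is why a bound on the weighted matrix is a natural starting point for extracting a large unweighted subset.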


Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

Neural Information Processing Systems

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $\mu$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $\mu$ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability $\tau$, having an additive $\log(1/\tau)$ dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.


High-dimensional reliability-based design optimization using stochastic emulators

arXiv.org Machine Learning

Reliability-based design optimization (RBDO) is traditionally formulated as a nested optimization and reliability problem. Although surrogate models are generally employed to improve efficiency, the approach remains computationally prohibitive in high-dimensional settings. This paper proposes a novel RBDO framework based on a stochastic simulator viewpoint, in which the deterministic limit-state function and the uncertainty in the model inputs are combined into a unified stochastic representation. Under this formulation, the system response conditioned on a given design is modeled directly through its output distribution, rather than through an explicit limit-state function. Stochastic emulators are constructed in the design space to approximate the conditional response distribution, enabling the semi-analytical evaluation of failure probabilities or associated quantiles without resorting to Monte Carlo simulation. Two classes of stochastic emulators are investigated, namely generalized lambda models and stochastic polynomial chaos expansions. Both approaches provide a deterministic mapping between design variables and reliability constraints, which breaks the classical double-loop structure of RBDO and allows the use of standard deterministic optimization algorithms. The performance of the proposed approach is evaluated on a set of benchmark problems with dimensionality ranging from low to very high, including a case with stochastic excitation. The results are compared against a Kriging-based approach formulated in the full input space. The proposed method yields substantial computational gains, particularly in high-dimensional settings. While its efficiency is comparable to Kriging for low-dimensional problems, it significantly outperforms Kriging as the dimensionality increases.
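To illustrate what a semi-analytical failure probability can look like, the sketch below evaluates P(Y <= 0) for a response Y that follows a generalized lambda distribution. Several things here are assumptions rather than details from the paper: the Freimer-Kollia-Mudholkar-Lin parameterization of the quantile function, the restriction to positive shape parameters (bounded support), the convention that failure means Y <= 0, and the hypothetical design-to-parameter map `toy_emulator`, which stands in for a fitted emulator.

```python
def gld_quantile(u, lam1, lam2, lam3, lam4):
    """Generalized lambda quantile function (FKML form, assumed);
    requires lam2 > 0 and nonzero lam3, lam4."""
    left = (u ** lam3 - 1.0) / lam3
    right = ((1.0 - u) ** lam4 - 1.0) / lam4
    return lam1 + (left - right) / lam2

def gld_cdf(y, lam1, lam2, lam3, lam4, iters=200):
    """CDF obtained by bisection on the increasing quantile function.
    Assumes lam3 > 0 and lam4 > 0, so the support is a bounded interval."""
    lam = (lam1, lam2, lam3, lam4)
    lo, hi = 0.0, 1.0
    if y <= gld_quantile(lo, *lam):
        return 0.0
    if y >= gld_quantile(hi, *lam):
        return 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gld_quantile(mid, *lam) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def toy_emulator(d):
    """Hypothetical map from a scalar design d to lambda parameters:
    the design only shifts the location of the response distribution."""
    return (d, 1.0, 0.5, 0.5)

def failure_probability(d):
    """P(Y <= 0 | design d), a deterministic function of the design."""
    return gld_cdf(0.0, *toy_emulator(d))

if __name__ == "__main__":
    for d in (-1.0, 0.0, 1.0, 3.0):
        print(d, failure_probability(d))
```

Because `failure_probability` is a deterministic function of the design, it could be passed as a constraint to a standard optimizer, which is the sense in which the double loop disappears in this kind of formulation.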


The Geometry of Efficient Nonconvex Sampling

arXiv.org Machine Learning

We present an efficient algorithm for uniformly sampling from an arbitrary compact body $\mathcal{X} \subset \mathbb{R}^n$ from a warm start under isoperimetry and a natural volume growth condition. Our result provides a substantial common generalization of known results for convex bodies and star-shaped bodies. The complexity of the algorithm is polynomial in the dimension, the Poincaré constant of the uniform distribution on $\mathcal{X}$ and the volume growth constant of the set $\mathcal{X}$.